In Parts I and II (Hannan 1978a/b) we have argued that sociological studies of change in metric variables should be modeled as stochastic differential equations. We have shown that, at least for linear SDE's, we can obtain explicit solutions for probability densities and that there are no obvious impediments to estimating the parameters of such models from conventional panel data. We now turn attention to the practical details of this strategy. But we cannot borrow directly from the technical literature. While the problem of estimating SDE's has been widely studied, almost all work has considered time series designs. Since we focus on panel analysis, we must modify the usual strategies.

We begin by considering two broad approaches to empirical study of SDE's. One involves estimating structural parameters directly from integral equations; the second uses discrete approximations. We argue for the former strategy and outline an obvious maximum likelihood estimation approach in the panel context. In any realistic application of the method we propose, disturbances will be autocorrelated. The problem of autocorrelation stands as the major obstacle to sound inference concerning dynamic models. We thus devote considerable attention to this complication, especially in the context of estimation from pooled cross-section and time series designs.

After presenting the large sample estimation theory, we turn to Monte Carlo evidence on the small sample properties of the pooled estimators we use. In particular we contrast the performance of maximum likelihood and generalized least squares estimators.

In the final section we raise a very important practical problem: unequal spacing of observations. Virtually all methodological work on panel analysis assumes that data are equally spaced in time. We suggest that sociologists often obtain data with much less regular spacing; thus it is important to extend our strategy to such cases. One of the main advantages of using continuous time models is that they permit systematic treatment of unequally spaced data. We illustrate this advantage, and how maximum likelihood estimators may be adapted to handle unequal spacing.

1. Two Approaches

We begin with broad strategic considerations. Analysis of integral equations for the purpose of obtaining estimates of dynamic parameters, as sketched in Hannan (1978a), has obvious appeal. This strategy highlights the connection between the mathematical model and the statistical model. It also permits use of standard estimation techniques. The strategy has at least one drawback that in some circumstances limits its value sharply. In the case of systems of equations, it is difficult to impose constraints on parameters in estimation. Suppose, for example, that theory implies that one of the entries in $B$, the matrix of parameters of the endogenous portion of the system, is zero. The eigenvalue-eigenvector approach for using data to generate estimates of $B$ does not permit us to use this constraint in estimation. One consequence is non-zero estimates of the parameter "known" to be zero. Moreover, estimates of other parameters will be less than fully efficient. The latter is just an instance of the general rule that consistent estimators that ignore constraints have larger variances than other consistent estimators that use them.
This limitation has motivated statisticians and econometricians to seek more efficient estimators. Most attention has focused on so-called "exact discrete approximations" to stochastic differential equations. This strategy replaces the continuous-time model with a discrete-time analogue. The length of the discrete time interval is a parameter that, when made sufficiently small, makes the analogue converge to the proper continuous-time model. The advantage of this strategy is that constraints on parameters may be employed routinely. One does not [...] rely on continuous-time modeling, since the estimates [...] a fixed [...] approximation to the continuous-time model.

Exact discrete approximation estimators have been developed and applied to time series data (e.g., Bergstrom 1976) but not, to our knowledge, to panel data. Extensions to panel applications appear not to be trivial. We do not attempt such an extension here but continue to focus on the strategy of using integral equations directly. The most important consideration in this choice concerns the spacing of observations. The "exact discrete approximation" approach requires equally spaced observations. Thus it does not apply to wide classes of sociological research.



So we choose to retain generality at the cost of efficiency in estimation. As long as we use reasonably large samples, the price should not be too high. However, when systems of equations are to be estimated from equally spaced data on small samples, it is worth investigating the alternative approach. We do not pursue this problem here.



2. Single Equation Models

In Part II we treated the following simple extension of the OU process:

$$dY(t) = a\,dt + bY(t)\,dt + cX\,dt + \sigma\,d\beta(t), \qquad b < 0 \tag{1}$$

where $\beta(t)$ is normal Brownian motion. This model may be considered as a stochastic negative feedback, or linear partial adjustment, model with a single (constant) exogenous variable. As we indicated in the last chapter, this has solution (with initial condition $Y(0) = Y_0$)

$$Y(t) = e^{bt}Y_0 + \frac{a}{b}\left(e^{bt} - 1\right) + \frac{c}{b}\left(e^{bt} - 1\right)X + \varepsilon(t) \tag{2}$$

where

$$\varepsilon(t) = \sigma \int_0^t e^{b(t-s)}\,d\beta(s).$$



We rewrite (2), with obvious substitutions, as

$$Y(t) = a^* + b^*Y_0 + c^*X + \varepsilon(t) \tag{3}$$

Suppose that observations on $N$ entities at times $0$ and $t$ are available. If all $N$ units follow the same process, we may write

$$Y_i(t) = a^* + b^*Y_i(0) + c^*X_i + \varepsilon_i(t) \qquad (i = 1, \ldots, N) \tag{4}$$

As long as the disturbance process $d\beta(t)$ is independent from observation to observation, i.e., the $\varepsilon_i(t)$ are independent and identically distributed, this model may be analyzed by ordinary least squares (OLS). In fact, under the generalized Gauss-Markov theorem OLS is a best linear unbiased (BLUE) estimator of $a^*$, $b^*$, $c^*$, and $\sigma$. That is, among the class of linear unbiased estimators it has smallest variance. Since the disturbances are normally distributed, OLS is identical to maximum likelihood. This fact considerably simplifies the problem of using $a^*$, $b^*$, $c^*$ to estimate the structural parameters $a$, $b$, and $c$.

Maximum likelihood estimators have a strong invariance property: monotonic (i.e., order-preserving) functions of MLE are also MLE. By comparing $b^*$ and $b$ in (2) and (4), we see

$$e^{b\Delta t} = b^*, \qquad \text{so that} \qquad b = \frac{1}{\Delta t}\ln b^* \tag{5}$$

where $\Delta t$ is the interval between the two observations. Consequently $(1/\Delta t)\ln \hat{b}^*$ is a maximum likelihood estimator as long as $\hat{b}^*$ is ML. This fact does not hold generally for least squares estimators (i.e., when OLS is not ML). Least squares estimators retain consistency under nonlinear transformations but lose asymptotic efficiency.
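The mapping from the estimated coefficients of (4) back to the structural parameters can be sketched in a few lines. This is a minimal illustration, not the authors' program: it assumes the substitutions implied by (2), namely $b^* = e^{b\Delta t}$, $a^* = (a/b)(e^{b\Delta t} - 1)$ and $c^* = (c/b)(e^{b\Delta t} - 1)$, and the function name `structural_from_reduced` is hypothetical.

```python
import math


def structural_from_reduced(a_star, b_star, c_star, dt):
    """Recover (a, b, c) of dY = (a + bY + cX)dt + sigma*dB from (a*, b*, c*).

    Assumes b* = exp(b*dt), a* = (a/b)(b* - 1), c* = (c/b)(b* - 1).
    """
    if not 0.0 < b_star < 1.0:
        # b < 0 in (1) implies 0 < b* < 1; outside that range ln(b*) is
        # either undefined or gives a non-negative b.
        raise ValueError("b* must lie strictly between 0 and 1")
    b = math.log(b_star) / dt
    scale = b / (b_star - 1.0)  # inverse of (exp(b*dt) - 1) / b
    return a_star * scale, b, c_star * scale


# Round trip: start from structural values, form (a*, b*, c*), map back.
a, b, c, dt = 2.0, -0.5, 1.5, 1.0
b_star = math.exp(b * dt)
a_star = (a / b) * (b_star - 1.0)
c_star = (c / b) * (b_star - 1.0)
a_hat, b_hat, c_hat = structural_from_reduced(a_star, b_star, c_star, dt)
print(a_hat, b_hat, c_hat)
```

By the invariance property, applying this mapping to ML estimates of $a^*$, $b^*$, $c^*$ yields ML estimates of $a$, $b$, $c$.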
In much substantive work the exogenous variables will not be constant over time. The solution of the differential equation involves terms of the form (see Hannan 1978a):

$$\int_0^t f(s)\,ds$$

where $f(s)$ is some function of time. Coleman (1968) remarked that common scientific practice when $f(s)$ is unknown is to assume that the exogenous variable(s) change linearly over time. Then the integral equation has the form (see 10.14):

$$Y(t) = a^* + b^*Y_0 + c_1^*X_0 + c_2^*\Delta X(t) + \varepsilon(t) \tag{6}$$

which may be analyzed by OLS or ML. Of course, other hypotheses concerning the dynamic behavior of exogenous variables give different estimation equations. This matter may not be treated mechanically. After all, the dynamics of the outcome variable must depend on the dynamics of the forcing variables. Unless we specify the latter well, we cannot have much hope of doing a good job with the former. Nonetheless, for simplicity we focus attention on the simplest assumptions concerning $f(s)$.



3. Systems of Equations

Next consider the simple linear system

$$\begin{aligned}
dY_1(t) &= a_1\,dt + b_{11}Y_1(t)\,dt + b_{12}Y_2(t)\,dt + \cdots + b_{1K}Y_K(t)\,dt + c_1X\,dt + \sigma\,d\beta_1(t)\\
&\;\;\vdots\\
dY_K(t) &= a_K\,dt + b_{K1}Y_1(t)\,dt + b_{K2}Y_2(t)\,dt + \cdots + b_{KK}Y_K(t)\,dt + c_KX\,dt + \sigma\,d\beta_K(t)
\end{aligned}$$

where the $\beta_k(t)$, $k = 1, \ldots, K$, are independent normal Brownian motion processes. We express this system more compactly as

$$dY(t) = a\,dt + BY(t)\,dt + cX\,dt + \Sigma\,d\beta(t) \tag{7}$$

where $\Sigma = \sigma I$ and $I$ is a $K$ by $K$ identity matrix. In Part II (Hannan 1978b), we saw that (7) can be integrated with initial condition $Y(0) = Y_0$, to yield

$$Y(t) = e^{Bt}Y_0 + B^{-1}\left(e^{Bt} - I\right)a + B^{-1}\left(e^{Bt} - I\right)cX + \varepsilon(t) \tag{8}$$

where $I$ is a $K$ by $K$ identity matrix. Since the disturbances are independent Brownian motions, $\varepsilon(t)$ has a simple structure. Each $\varepsilon_k(t)$ is normal with mean zero and variance $-\dfrac{\sigma^2}{2b_{kk}}\left(1 - e^{2b_{kk}t}\right)$, and $E[\varepsilon_k(t)\,\varepsilon_j(t)] = 0$ for $k \neq j$. Thus the linear system

$$Y(t) = a^* + B^*Y_0 + c^*X + \varepsilon(t) \tag{9}$$

has independent disturbances. It is, in fact, a recursive system (we do not need "simultaneous equations estimators"). Again OLS and ML are equivalent. They give asymptotically unbiased and efficient estimators of $a^*$, $B^*$, $c^*$, and $\sigma$.



Therefore we may employ the eigenvalue-eigenvector method of Part I, Section 5 to estimate $B$ from $B^*$. It is then a simple problem in algebra to recover $a$ and $c$ from $a^*$, $c^*$, and $B$. We may also, if we wish, calculate standard errors of these parameter vectors from the estimated standard errors of $a^*$, $c^*$, and $B^*$. All these calculations may be done with any of a set of widely available computer routines for extracting eigenvectors and eigenvalues.
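The algebra behind this step can be written out as follows. This is a sketch under an added assumption: $B^*$ is taken to have $K$ distinct, positive, real eigenvalues, none equal to one, so that the logarithms are real and the inverses exist. The symbols $V$ and $\Lambda$ are notation introduced only for this illustration, and $\Delta t$ is the interval between observations.

```latex
\begin{aligned}
B^* &= e^{B\Delta t} = V \Lambda V^{-1},
      \qquad \Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_K), \\
B   &= \frac{1}{\Delta t}\, V \operatorname{diag}(\ln \lambda_1, \ldots, \ln \lambda_K)\, V^{-1}, \\
a^* &= B^{-1}\left(B^* - I\right) a
      \;\Longrightarrow\; a = \left(B^* - I\right)^{-1} B\, a^*, \\
c^* &= B^{-1}\left(B^* - I\right) c
      \;\Longrightarrow\; c = \left(B^* - I\right)^{-1} B\, c^* .
\end{aligned}
```

In the single equation case these expressions reduce to $b = (1/\Delta t)\ln b^*$ as in (5).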



The system case poses only one new inference issue, raised at the outset. The procedure we use does not employ constraints on parameters in $B$. It is not, as a result, fully efficient when such constraints are appropriate.

4. Autocorrelation of Disturbances

We have not yet mentioned the practical complication that pervades most discussions of temporal analysis: autocorrelation of disturbances. Wide experience reveals that factors omitted from our models are at least moderately stable over time. As a result disturbances tend to be correlated over time. When disturbances are autocorrelated (i.e., correlated over time), the effects of omitted variables are usually confounded with the effects of the lagged dependent variable(s), $Y_0$. This is a standard problem in panel analysis. Unless autocorrelation is handled properly, we will not obtain good estimates of dynamic parameters. The problem is particularly severe with continuous-time models, in which all parameter estimates depend on $B$, since autocorrelation particularly affects estimates of $B$ (see Johnston 1972; Hannan and Young 1977).

Although we suspect autocorrelation in practical applications, the model developed so far does not reflect this. Recall that $\beta(t)$ is a Brownian motion process, with independent increments. Thus the increments in successive periods are independent (this is not true in general of Markov processes, as we remarked in Part II). So we must modify the model if we are to deal with the autocorrelation problem in a systematic manner. We consider two strategies. The first involves complicating the random forcing function by relaxing the independent increments assumption. The alternative is to introduce individual-specific parameters (effects of stable individual characteristics) into the stochastic differential equation.



Of course the two strategies may be combined, but it is useful to contrast the substantive interpretations that fit one or the other. The strategy of introducing unit-specific effects fits well those circumstances in which the omitted causal variables are approximately constant over the study period. When the omitted variables change greatly over the study period, the alternative procedure is called for. Our substantive work has relied on the unit-specific effect approach. Let us begin by outlining this approach. Once we have done so, we will be in a better position to clarify the nature of the alternative procedure.

5. Unit-Specific Effects

Suppose that the $N$ units under study change according to the same general process (7) but that each unit has a distinct "constant" rate of change. In the study of individual careers these constants might include physiological characteristics (e.g., energy levels), enduring features of personality, status origins, ethnicity, linguistic styles, etc. In studies of organizations, they would include material infrastructure (e.g., characteristics of physical arrangements), stable features of work technology, long-standing political alliances, cultural attributes of members, etc. For each unit, summarize the effects of all such stable omitted variables as a single quantity, $m_{ki}$ (for the $i$th unit in the $k$th equation). In other words, each unit has its own dynamic process (due to the $m_{ki}$) but the remaining parameters are constrained to be the same for all units. Thus we must consider the system of $NK$ equations:

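$$
\begin{bmatrix} dY_{11} & \cdots & dY_{1N} \\ \vdots & & \vdots \\ dY_{K1} & \cdots & dY_{KN} \end{bmatrix}
=
\begin{bmatrix} a_1 & \cdots & a_1 \\ \vdots & & \vdots \\ a_K & \cdots & a_K \end{bmatrix} dt
+
\begin{bmatrix} m_{11} & \cdots & m_{1N} \\ \vdots & & \vdots \\ m_{K1} & \cdots & m_{KN} \end{bmatrix} dt
+
\begin{bmatrix} b_{11} & \cdots & b_{1K} \\ \vdots & & \vdots \\ b_{K1} & \cdots & b_{KK} \end{bmatrix}
\begin{bmatrix} Y_{11} & \cdots & Y_{1N} \\ \vdots & & \vdots \\ Y_{K1} & \cdots & Y_{KN} \end{bmatrix} dt
+
\begin{bmatrix} c_1 \\ \vdots \\ c_K \end{bmatrix}
\begin{bmatrix} X_1 & \cdots & X_N \end{bmatrix} dt
+
\sigma
\begin{bmatrix} d\beta_{11} & \cdots & d\beta_{1N} \\ \vdots & & \vdots \\ d\beta_{K1} & \cdots & d\beta_{KN} \end{bmatrix}
\tag{10}
$$

or, equivalently:

$$dY(t) = a\iota'\,dt + M\,dt + BY(t)\,dt + cx'\,dt + \Sigma\,d\beta(t) \tag{11}$$

where $\iota'$ is a $1$ by $N$ vector of ones.

The system of equations in (11) has solution (with initial condition $Y(t_0) = Y_0$ and $\Delta t = t - t_0$):

$$Y(t) = e^{B\Delta t}Y_0 + B^{-1}\left(e^{B\Delta t} - I\right)\left(a\iota' + cx'\right) + B^{-1}\left(e^{B\Delta t} - I\right)M + E(t) \tag{12}$$

As before we write (12) more compactly as

$$Y(t) = a^*\iota' + M^* + B^*Y_0 + c^*x' + E(t) \tag{13}$$

And this model differs from (9) due only to the presence of $M^*$.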



Suppose the model in (11) is correct but the analyst ignores the unobservable variables whose effects are summarized in $M^*$. That is, he estimates

$$Y(t) = a^*\iota' + B^*Y_0 + c^*x' + U(t) \tag{14}$$

where

$$U(t) = M^* + E(t). \tag{15}$$

It is easy to show that OLS gives biased estimates of $B^*$, $a^*$, and $c^*$. Since the factors in $M$ are constants, they affect $Y$ at all times including $Y(t_0)$. Thus $M$ must be correlated with $Y_0$. Consequently OLS "gives credit" to $Y_0$ for the effects in $M$. This gives biased estimates of $B^*$, and thus of all the parameters of the model. And this bias is usually substantial, as we illustrate below.

This is an instance of the classic autocorrelation problem raised earlier. When the effects in $M$ are ignored and thereby forced into the disturbance, the latter must become positively autocorrelated. Failure to acknowledge this, i.e., using estimators that assume $U(t)$ is uncorrelated with $Y_0$, leads to biased estimation. The usual "two-wave" panel does not contain enough information for this autocorrelation problem to be corrected. But as long as the effects of omitted variables are constant ($M(t) = M$), this autocorrelation problem is easily handled by a change in research design.
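The direction and rough size of this bias can be seen in a small simulation. The sketch below is illustrative only and makes several assumptions not in the text: a single equation with no exogenous variable, a discrete-time form $Y_{i1} = b^*Y_{i0} + m_i + \varepsilon_i$, arbitrary parameter values, and a burn-in period so that $Y_{i0}$ already reflects the unit effect $m_i$. The helper names are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def two_wave_panel(n_units, b_star, sd_m, sd_eps, rng, burn_in=50):
    """Generate (Y_0, Y_1) for n_units with constant unit effects m_i."""
    y0, y1 = [], []
    for _ in range(n_units):
        m = rng.gauss(0.0, sd_m)
        y = m / (1.0 - b_star)  # start at the unit's own equilibrium level
        for _ in range(burn_in):
            y = b_star * y + m + rng.gauss(0.0, sd_eps)
        y0.append(y)
        y1.append(b_star * y + m + rng.gauss(0.0, sd_eps))
    return y0, y1


rng = random.Random(12345)
b_true = 0.5
# No unit effects: OLS on the two-wave panel recovers b* closely.
b_no_effects = ols_slope(*two_wave_panel(5000, b_true, 0.0, 1.0, rng))
# Ignored unit effects: Y_0 is correlated with the disturbance m + eps.
b_with_effects = ols_slope(*two_wave_panel(5000, b_true, 1.0, 1.0, rng))
print(b_no_effects, b_with_effects)
```

With these illustrative values the second estimate is pushed well above the true $b^*$, because $Y_0$ absorbs credit for the omitted constant effects.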

6. Pooled Cross-Section and Time Series Estimators

Biometricians (Henderson 1952) and econometricians (Kuh 1959; Balestra and Nerlove 1968) have proposed estimators for such models in a discrete-time framework. Hannan and Freeman (1978) applied similar estimators to a continuous-time model. Before discussing the estimators, we must address a broad methodological issue: whether the unit-specific components are considered fixed or random effects.



As Searle (1970) notes, the fixed effect perspective fits situations in which all the interest attaches to the units under study and no effort will be made to generalize findings to other units. Then the $m_{ki}$ are considered a set of $NK$ parameters to be estimated. When the units studied are chosen to represent some broader class of units (i.e., some population of units), the random effects perspective is appropriate. Then the proper strategy is to model the general distribution of unit-specific effects and to treat those in the sample as realizations of the general process generating unit effects. Then interest centers in the parameters, not of the units, but of the distribution of unit effects. The distribution of the $m_{ki}$ typically involves far fewer than $NK$ parameters. Usually we assume that the population distribution is normal; hence it is specified by two parameters, the mean and variance.

The choice between the fixed and random perspectives is usually discussed in an experimental design context. Consider for example the income maintenance experiment discussed in earlier chapters. We implemented three levels of income support and four tax rates. For example, we use tax rates of 50% and 80%. If there were no scientific or policy interest in any other tax rates, a fixed effects model would be appropriate. However, we wish to generalize findings to other tax rates, e.g., 60%; thus we adopt a random effects perspective. But when interest focuses on discrete alternatives, e.g., research on the effectiveness of several qualitatively different organizational design programs or rehabilitation programs, a fixed effects framework may often be more appropriate.

In this chapter we consider effects of unobserved variables. Should these be treated as fixed or random? Since we cannot even enumerate the factors whose effects are summarized in $m_i$, it seems awkward to treat these as fixed effects. One might still argue that the units were chosen because they have some (unmeasured) properties of special scientific interest and that these properties are summarized in $m_i$. So the choice among the two perspectives appears once again to turn on the question of whether the units were chosen to be representative of some broad class or whether they were selected because they have some very distinctive property. We suspect that most empirical research in the social sciences comes closer to the former than the latter. If so, the random-effects model is more generally applicable. We have focused on this model in our substantive research. Nonetheless we grant that both models have social science utility and we discuss estimators from each perspective.

It suffices to consider only single equation models, as we noted above. Suppose we have measurements on the stochastic process $\{Y_i(t)\}$ at times $t_0, t_1, \ldots, t_T$ and assume that the same stochastic differential equation (1) generates all the observations. We specify the following pooled model:

$$Y_{it} = a^* + m_i^* + b^*Y_{i,t-1} + c^*X_i + \varepsilon_{it} \qquad (i = 1, \ldots, N;\; t = 1, \ldots, T) \tag{16}$$

7. Fixed Effect Estimators

When the $m_i^*$ ($i = 1, \ldots, N$) are considered fixed parameters, estimation of (16) is simple. As long as $T \geq 3$, we merely add dummy variables for each unit (i.e., variables that are unity for observations on the unit and zero for observations on all other units). Alternatively we may express observations as deviations from unit means (where $\bar{Y}_i = \frac{1}{T}\sum_t Y_{it}$):

$$Y_{it} - \bar{Y}_i = b^*\left(Y_{i,t-1} - \bar{Y}_{i,-1}\right) + \left(\varepsilon_{it} - \bar{\varepsilon}_i\right) \tag{17}$$

and apply ordinary least squares. Under the assumptions of the dynamic model, these OLS estimators are again maximum likelihood estimators and are asymptotically unbiased and efficient.

Note that the constant exogenous variable has been lost in (17). In this pooled "within-unit" regression, one cannot estimate both unit-specific parameters and the effects of exogenous variables that do not vary over time. We do not face such a limitation in the random effects model, where $c$ can indeed be estimated.

The $m_i^*$ are recovered from the estimates of (17) by straightforward algebraic operations (see Searle (1970)). These are generalizations of the procedures used to recover the intercept in a conventional linear regression of variables taken as deviations from the (grand) mean. So once we have chosen the proper design, the pooled multi-wave model, no new issues arise in estimating the fixed effects model for unit-specific rates of change.
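The deviations-from-unit-means regression in (17) can be sketched as follows. This is a minimal illustration with a hypothetical helper name, `within_slope`; it assumes each unit's record is a list $[Y_{i0}, Y_{i1}, \ldots, Y_{iT}]$ and that there are no time-varying exogenous variables. The toy data are noise-free so that the unit effects are the only source of between-unit differences.

```python
def within_slope(panel):
    """OLS slope of (17): regress unit-demeaned Y_it on unit-demeaned Y_i,t-1.

    panel[i] is the list [Y_i0, Y_i1, ..., Y_iT] for unit i.
    """
    sxy = 0.0
    sxx = 0.0
    for series in panel:
        lagged, current = series[:-1], series[1:]
        T = len(current)
        lag_mean = sum(lagged) / T
        cur_mean = sum(current) / T
        for x, y in zip(lagged, current):
            sxy += (x - lag_mean) * (y - cur_mean)
            sxx += (x - lag_mean) ** 2
    return sxy / sxx


def noise_free_series(y0, m, b_star, T):
    """Y_it = m + b* Y_i,t-1 with no disturbance, for illustration only."""
    series = [y0]
    for _ in range(T):
        series.append(m + b_star * series[-1])
    return series


panel = [
    noise_free_series(0.0, 1.0, 0.5, 3),   # unit with effect m = 1
    noise_free_series(4.0, -2.0, 0.5, 3),  # unit with effect m = -2
]
b_within = within_slope(panel)
print(b_within)
```

Because each unit's effect is removed by subtracting its own means, the slope is recovered even though the two units have very different levels.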

8. Random Effects Estimators

The alternative perspective considers the $m_i$ to be random variables over units but constants over time. The usual specification is that the $m_i$ are independent and identically distributed from a normal distribution with mean zero and variance $\sigma_m^2$. We further assume

$$E[m_i X_{i'}] = 0 \quad \text{for all } i, i'$$

$$E[m_i \varepsilon_{i't}] = 0 \quad \text{for all } i, i', t$$

Then the $m_i^*$ have the same properties but are transformed from $N(0, \sigma_m^2)$ to $N(0, \sigma_{m^*}^2)$, where $\sigma_{m^*}^2 = \sigma_m^2\left[(e^{b\Delta t} - 1)/b\right]^2$.

Since the $m_i^*$ are unobserved random variables, they may be considered a component of the disturbance for purposes of estimation. To emphasize this fact we write the model as

$$Y_{it} = a^* + b^*Y_{i,t-1} + c^*X_i + u_{it}, \qquad u_{it} = m_i^* + \varepsilon_{it} \tag{18}$$

Under our assumptions the disturbance, $u_{it}$, has a normal distribution with mean zero and covariance structure:

$$E[u_{it}u_{i't'}] = \begin{cases} \sigma_{m^*}^2 + \sigma_\varepsilon^2 & \text{if } i = i' \text{ and } t = t' \\ \sigma_{m^*}^2 & \text{if } i = i',\; t \neq t' \\ 0 & \text{if } i \neq i' \end{cases} \tag{19}$$

where $\sigma_\varepsilon^2$ is $-\dfrac{\sigma^2}{2b}\left(1 - e^{2b\Delta t}\right)$.



If we arrange observations in (18) so that the first $T$ are from unit 1, the next $T$ from unit 2, etc., the variance-covariance matrix of disturbances has the simple block diagonal form:

$$E[uu'] = \sigma_u^2\,\Omega, \qquad \Omega = \begin{bmatrix} A & 0 & \cdots & 0 \\ 0 & A & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & A \end{bmatrix} \tag{20}$$

with

$$\sigma_u^2 = \sigma_\varepsilon^2 + \sigma_{m^*}^2 \tag{21}$$

and each block has the structure:

$$A = \begin{bmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{bmatrix} \tag{22}$$

and

$$\rho = \frac{\sigma_{m^*}^2}{\sigma_{m^*}^2 + \sigma_\varepsilon^2} \tag{23}$$

Note that $\rho$ is the proportion of "error variance" that is unit-specific. That is, it may be considered a measure of the importance of the unit-specific effects relative to the Brownian motion noise process. The parameter $\rho$ is called the autocorrelation coefficient for the unit-specific effects model. The simple model we are considering holds that units are homogeneous in the sense that $\rho$ is constant over units.
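A short numerical sketch shows how the two variance components and $\rho$ follow from the structural parameters. It rests on assumptions that should be checked against the original derivation: $\sigma_\varepsilon^2 = \sigma^2(e^{2b\Delta t} - 1)/(2b)$ for the Brownian motion component, and $\sigma_{m^*}^2 = \sigma_m^2[(e^{b\Delta t} - 1)/b]^2$ for the transformed unit effect, with $b < 0$. The function names are hypothetical.

```python
import math


def variance_components(b, sigma, sigma_m, dt):
    """Return (sigma_eps^2, sigma_m*^2, rho) for the unit-effects model.

    Assumes b < 0, sigma_eps^2 = sigma^2 (exp(2 b dt) - 1) / (2 b), and
    sigma_m*^2 = (sigma_m (exp(b dt) - 1) / b)^2.
    """
    sigma_eps2 = sigma ** 2 * (math.exp(2.0 * b * dt) - 1.0) / (2.0 * b)
    sigma_mstar2 = (sigma_m * (math.exp(b * dt) - 1.0) / b) ** 2
    rho = sigma_mstar2 / (sigma_mstar2 + sigma_eps2)
    return sigma_eps2, sigma_mstar2, rho


def correlation_block(rho, T):
    """T x T block with ones on the diagonal and rho elsewhere."""
    return [[1.0 if s == t else rho for t in range(T)] for s in range(T)]


sigma_eps2, sigma_mstar2, rho = variance_components(
    b=-0.5, sigma=1.0, sigma_m=1.0, dt=1.0
)
A = correlation_block(rho, 3)
print(sigma_eps2, sigma_mstar2, rho)
```

Larger unit-effect variance, or a longer interval between waves, changes the share $\rho$ of disturbance variance that is unit-specific.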

Before considering estimators, we consider the systems case. For simplicity we continue to focus on the case where $x$ is constant. The model may be written:

$$y_t = Z_t\gamma + u_t \tag{24}$$

where

$$\begin{aligned}
y_t &= (y_{11}, y_{12}, \ldots, y_{1T}, \ldots, y_{N1}, \ldots, y_{NT})' \\
y_{t-1} &= (y_{10}, y_{11}, \ldots, y_{1,T-1}, \ldots, y_{N0}, \ldots, y_{N,T-1})' \\
x_1 &= (x_{11}, x_{11}, \ldots, x_{11}, \ldots, x_{1N}, \ldots, x_{1N})' \\
x_J &= (x_{J1}, x_{J1}, \ldots, x_{J1}, \ldots, x_{JN}, \ldots, x_{JN})' \\
Z_t &= (\iota, y_{t-1}, x_1, \ldots, x_J) \quad \text{(where } \iota \text{ is an } NT \times 1 \text{ vector of ones)} \\
u_t &= (u_{11}, u_{12}, \ldots, u_{1T}, \ldots, u_{N1}, \ldots, u_{NT})'
\end{aligned}$$

and

$$\gamma = (a^*, b^*, c_1^*, \ldots, c_J^*)'$$

At this point it is natural to search for a consistent estimator which avoids the problem in the disturbances. The existence of such an estimator is suggested by the fact that we can transform (24) in such a way as to produce "well-behaved" disturbances. What we need is to find a matrix $\Omega^{-1/2}$ which when applied to (24) yields

$$\Omega^{-1/2}y_t = \Omega^{-1/2}Z_t\gamma + \Omega^{-1/2}u_t \tag{25}$$

with

$$E\left[\Omega^{-1/2}u_tu_t'\,\Omega^{-1/2}\right] = \sigma_u^2 I \tag{26}$$

Nothing in the causal structure has been changed and we can apply ordinary least squares to (25). Because of (26), OLS applied to the transformed model is now a consistent and asymptotically efficient estimator.

The procedure suggested in (25) is an application of the widely useful generalized least squares (GLS) approach to estimation. The application of GLS to pooled models is commonly advocated in the econometric and biometric literatures (Nerlove 1971; Searle 1971).

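Since we will make continued reference to the GLS estimator we need a somewhat more formal representation. The GLS estimator is defined as

$$\hat{\gamma} = \left(Z_t'\,\Omega^{-1}Z_t\right)^{-1}Z_t'\,\Omega^{-1}y_t$$

where

$$\Omega^{-1} = \begin{bmatrix} A^{-1} & & 0 \\ & \ddots & \\ 0 & & A^{-1} \end{bmatrix}$$

and (cf. Hannan and Young, 1974)

$$A^{-1} = \frac{1}{1-\rho}\left(I_T - \iota\iota'/T\right) + \frac{1}{\eta}\left(\iota\iota'/T\right) \tag{27}$$

where $\eta = (1 - \rho) + T\rho$, and $\iota$ is a $(T \times 1)$ vector of ones.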
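The closed form in (27) can be checked numerically by multiplying it against the block $A$ (ones on the diagonal, $\rho$ elsewhere). The sketch below uses plain nested lists, and the helper names are hypothetical. It assumes $0 \leq \rho < 1$ so that $A$ is invertible.

```python
def correlation_block(rho, T):
    """A: T x T matrix with ones on the diagonal and rho off the diagonal."""
    return [[1.0 if s == t else rho for t in range(T)] for s in range(T)]


def correlation_block_inverse(rho, T):
    """Closed-form inverse (27): (I - J/T)/(1 - rho) + (J/T)/eta, J = ones."""
    eta = (1.0 - rho) + T * rho
    within_weight = 1.0 / (1.0 - rho)
    between_weight = 1.0 / eta
    return [
        [
            within_weight * ((1.0 if s == t else 0.0) - 1.0 / T)
            + between_weight / T
            for t in range(T)
        ]
        for s in range(T)
    ]


def matmul(P, Q):
    """Product of two square matrices stored as nested lists."""
    n = len(P)
    return [
        [sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


rho, T = 0.4, 4
product = matmul(correlation_block(rho, T), correlation_block_inverse(rho, T))
max_error = max(
    abs(product[i][j] - (1.0 if i == j else 0.0))
    for i in range(T)
    for j in range(T)
)
print(max_error)
```

The same two weights, $1/(1-\rho)$ on within-unit deviations and $1/\eta$ on unit means, are what the GLS estimator applies to the two kinds of variation.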

The form of GLS transformation (27) can be intuitively motivated as follows. The peculiar feature of pooled models is the use of both cross-sectional (between-unit) and longitudinal (within-unit) variation to estimate causal parameters. The richness of the data presents an implicit choice: how to weight one type of variation relative to another. Generalized least squares uses $\rho$ to weight the two types of information. To see this, consider the case where $\rho = 0$. Then $\Omega^{-1/2} = I$ and observations are transformed in (25) by an identity transformation. GLS reduces to OLS, where cross-sectional and time series variation are weighted proportionately to $N$ and $T$ (see Maddala, 1971). At the other extreme, when $\rho = 1$, $\Omega^{-1/2} = \iota\iota'/T$. This transformation averages observations over time for each unit. The result is a regression on grouped observations where all of the weight is placed on cross-sectional variation. In cases where $\rho$ takes on a value $0 \leq \rho \leq 1$, GLS weights time series variation inversely to $\rho$. Such a weighting seems appropriate since $\rho$ measures redundancy in the time series. The more redundancy, the lower the weight attached to longitudinal variation.

So far we have treated $\rho$ as known a priori. But we know of no realistic cases where sociological researchers have prior knowledge of the value of $\rho$. Thus we consider methods of estimating $\rho$ and properties of generalized least squares estimators that use estimates of $\rho$.

The most widely used procedure for estimating $\rho$ uses the fixed-effects estimator discussed in the previous section. The results of LSDV can be used to calculate $\hat{\rho}$ as follows. To estimate $\rho$ we need an estimate of $\sigma_{m^*}^2$. Nerlove (1971) suggests

$$\hat{\sigma}_{m^*}^2 = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{m}_i^* - \frac{1}{N}\sum_{j=1}^{N}\hat{m}_j^*\right)^2 \tag{28}$$

An obvious estimator of $\sigma_\varepsilon^2$ is the sum of squared residuals from the LSDV regression divided by $NT$. Then

$$\hat{\rho} = \frac{\hat{\sigma}_{m^*}^2}{\hat{\sigma}_{m^*}^2 + \hat{\sigma}_\varepsilon^2} \tag{29}$$

Nerlove chose $\hat{\rho}$ in (29) over a maximum likelihood estimate to avoid negative values of $\rho$ (which are implausible in most applications). Unfortunately the estimator in (29) is upwardly biased (at least in small samples), with the magnitude of the bias inversely related to $\rho$.

Recall that GLS requires consistent estimates of $\rho$. The bias in $\hat{\rho}$ does not, however, appear to unduly damage the resulting GLS estimators (Amemiya, 1967). We study this issue further below. To acknowledge the fact that we are using estimates of $\rho$ rather than the true values, it is more precise to refer to this estimator as modified generalized least squares (MGLS).

This estimator is consistent and asymptotically efficient even though it uses biased estimates of $\rho$. All that is required for these large sample properties is that $\hat{\rho}$ be a consistent estimator of $\rho$ (Aitken 1934). Empirical researchers are often more concerned with the behavior of estimators in small or moderate size samples. And the bias in $\hat{\rho}$ may be damaging in such samples. We report results on small sample properties below.
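The two-round calculation of $\hat{\rho}$ can be sketched as follows. This is an illustration under a stated reading of the procedure, not a transcription of Nerlove's program: it assumes the unit-effect variance is the mean squared deviation of the estimated unit effects from their average, that the noise variance is the LSDV residual sum of squares divided by $NT$, and that $\hat{\rho}$ is the ratio of the first to the sum of the two. The function name and the input values are hypothetical.

```python
def nerlove_rho(unit_effects, ssr_lsdv, n_observations):
    """Estimate rho from LSDV output.

    unit_effects   -- estimated unit-specific intercepts, one per unit
    ssr_lsdv       -- sum of squared residuals from the LSDV regression
    n_observations -- total number of observations, N * T
    """
    n_units = len(unit_effects)
    mean_effect = sum(unit_effects) / n_units
    var_m = sum((m - mean_effect) ** 2 for m in unit_effects) / n_units
    var_eps = ssr_lsdv / n_observations
    return var_m / (var_m + var_eps)


# Illustrative numbers only: 4 units observed over 5 transitions each.
rho_hat = nerlove_rho([1.0, -1.0, 2.0, -2.0], ssr_lsdv=10.0, n_observations=20)
print(rho_hat)
```

Both variance estimates are non-negative by construction, which is why this estimator cannot produce a negative $\hat{\rho}$.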



Finally, we may form maximum likelihood estimators for the random effects model. Since the $u_{it}$ are joint normally distributed, this amounts to a standard ML regression problem. Estimates of $a^*$, $b^*$, $c_1^*, \ldots, c_J^*$, $\sigma_u^2$ and $\rho$ may be found by maximizing the log likelihood function

$$\ln L = -\frac{NT}{2}\log 2\pi\sigma_u^2 - \frac{1}{2}\log|\Omega| - \frac{1}{2\sigma_u^2}\,u'\,\Omega^{-1}u \tag{30}$$

where $u = y_t - Z_t\gamma$ and $\Omega$ is the block diagonal matrix whose blocks are $A$. Since both $\sigma_u^2$ and $\rho$ must be non-negative, we may maximize (30) subject to these constraints.

The MLE (both unconstrained and constrained) have the good large sample properties (consistency, efficiency) discussed in earlier applications. However, unlike cases discussed to this point, ML is not identical for this case with the best least squares estimator, MGLS. There are three reasons why the two estimators will differ. First, least squares and ML estimates of variance components differ. Second, MGLS is a two round procedure while ML estimates all parameters simultaneously. Finally, as there is no closed-form solution to (30), MLE are found by iteration. Thus the numerical values of MLE depend as well on the shape of the likelihood function and the quality of the iterative procedure.
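Each step of such an iteration needs the value of the log likelihood at trial parameter values. The sketch below evaluates it for given residuals without forming any $NT \times NT$ matrix. It assumes the parameterization $E[uu'] = \sigma_u^2\,\Omega$ with equicorrelated blocks $A$, and uses two standard facts about such blocks: $|A| = (1-\rho)^{T-1}\eta$ and $u_i'A^{-1}u_i = W_i/(1-\rho) + B_i/\eta$, where $W_i$ is the within-unit sum of squares about the unit mean and $B_i = (\sum_t u_{it})^2/T$. The function name is hypothetical.

```python
import math


def random_effects_loglik(residuals, sigma_u2, rho):
    """Gaussian log likelihood for equicorrelated unit blocks.

    residuals[i] -- list of the T residuals u_i1, ..., u_iT for unit i
    sigma_u2     -- total disturbance variance (unit effect plus noise)
    rho          -- share of that variance that is unit-specific, 0 <= rho < 1
    """
    n_units = len(residuals)
    T = len(residuals[0])
    eta = (1.0 - rho) + T * rho
    log_det_block = (T - 1) * math.log(1.0 - rho) + math.log(eta)
    quad_form = 0.0
    for u in residuals:
        total = sum(u)
        between = total * total / T               # (sum of u)^2 / T
        within = sum(v * v for v in u) - between  # SS about the unit mean
        quad_form += within / (1.0 - rho) + between / eta
    return (
        -0.5 * n_units * T * math.log(2.0 * math.pi * sigma_u2)
        - 0.5 * n_units * log_det_block
        - quad_form / (2.0 * sigma_u2)
    )


# One unit, two waves: small enough to check against the 2 x 2 formulas.
loglik = random_effects_loglik([[1.0, 2.0]], sigma_u2=2.0, rho=0.5)
print(loglik)
```

A crude maximizer could evaluate this function over a grid of $\rho$ values in $[0, 1)$, re-estimating the regression coefficients by GLS at each value. This also keeps $\rho$ within its admissible range.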

Thus there are two major alternative approaches to the estimation of dynamic parameters in models with random unit-specific effects: maximum likelihood and generalized least squares. There is actually a third estimator that might be considered. The fixed-effects LSDV estimator is also consistent and asymptotically efficient for the random effects model (Amemiya 1967). Of the three, ML is preferred in large samples for reasons discussed earlier. It retains minimum variance properties under the non-linear transformations required to go from integral to differential equations. But what about smaller samples?



Throughout the discussion we have relied on large-sample theory.
As we mentioned earlier, it is important for empirical researchers to
obtain some information about the behavior of such estimators in small and
moderate sized samples. Two issues are important here: we want to compare
the efficiency of the various consistent estimators in finite samples, and
we also want to compare the performance of the consistent estimators with
those of inconsistent estimators (OLS, for example) which may have smaller
mean squared error in small samples (cf. Hurd 1972). We have not yet seen
analytical results on these issues. So we consider the results of Monte
Carlo experiments on the small sample properties of the various estimators.

9. Monte Carlo Studies of Small Sample Properties

We summarize results from two simulations that used the same structure.
The two studies partially overlap but also study some different estimators.
We concentrate here on the similar cases so as to give an overall comparison
of all the estimators under consideration. For more details see Hannan
and Young (1974, 1977) and Tuma and Young (1976).



Data Generation. Both studies generated data that fit the following
model:

$$Y_{it} = \gamma_1 Y_{i,t-1} + \gamma_2 X_{it} + u_{it} \qquad (31)$$

$$u_{it} = \mu_i + \varepsilon_{it}$$

where the components of u_it have the properties stated in Section 8. The
exogenous variable has the structure

$$X_{it} = 0.1\,t + 0.5\,X_{i,t-1} + w_{it}$$

where the w_it are independent normal variables. In these respects the
simulations followed Nerlove's (1971) procedure. However, they differed
from Nerlove's in four respects. First, we have chosen the number
of individuals N as fifty and the number of time periods T as
five, whereas Nerlove chose twenty-five and ten, respectively.
We chose the former values of N and T because they are representative of
many available data sets. Second, we have generated pseudo-random
variates by Marsaglia's rectangle-wedge-tail algorithm, recommended
as best by Knuth (1969), rather than the method described by Nerlove
(1971). Third, we have studied somewhat different combinations of
parameter values. In each combination we set a = 0.0 and σ = 1.0.
We selected five values for ρ: 0.0, 0.25, 0.50, 0.75, and 0.90. To
examine the dependence of estimator quality on the relative strength of the
effects of the lagged endogenous and exogenous variables, we chose three
combinations of b and c: (b, c) = (0.3, 1.0), (0.8, 1.0), and (0.8, 0.5). Thus, we
examined a total of fifteen combinations of parameter values. Fourth,
for each combination of parameter values we generated 100 sets of
data, where Nerlove generated 50. The additional data sets give increased
confidence about the properties of estimators.
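
The data-generating scheme just described can be sketched in a few lines of
Python. This is an illustrative sketch only, not the program used in either
study. It draws normal variates with the standard library rather than with
Marsaglia's algorithm, and it assumes that ρ is the share of the total
disturbance variance σ² that is due to the unit effect (our reading of the
properties stated in Section 8). The function name, the zero starting values,
the unit variance of the w's, and the burn-in length are hypothetical choices
made for the illustration.

```python
import random


def simulate_panel(n_units=50, n_periods=5, gamma1=0.3, gamma2=1.0,
                   rho=0.5, sigma2=1.0, burn_in=10, seed=0):
    """Return one simulated data set as a list of units.

    Each unit is a list of (lagged Y, X, Y) rows, one per retained period.
    Assumption: Var(mu_i) = rho * sigma2 and Var(eps_it) = (1 - rho) * sigma2.
    """
    rng = random.Random(seed)
    sd_mu = (rho * sigma2) ** 0.5
    sd_eps = ((1.0 - rho) * sigma2) ** 0.5
    panel = []
    for _ in range(n_units):
        mu = rng.gauss(0.0, sd_mu)           # permanent unit-specific effect
        x_prev, y_prev = 0.0, 0.0            # hypothetical starting values
        rows = []
        for t in range(1, burn_in + n_periods + 1):
            x = 0.1 * t + 0.5 * x_prev + rng.gauss(0.0, 1.0)
            u = mu + rng.gauss(0.0, sd_eps)  # composite disturbance
            y = gamma1 * y_prev + gamma2 * x + u
            if t > burn_in:                  # keep only the last n_periods waves
                rows.append((y_prev, x, y))
            x_prev, y_prev = x, y
        panel.append(rows)
    return panel


data = simulate_panel(gamma1=0.8, gamma2=0.5, rho=0.9, seed=42)
```

Calling the function 100 times with different seeds for each of the fifteen
parameter combinations would reproduce the layout of the experiment, though
not its particular random numbers.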
Estimators. We study the behavior of the following estimators:

(1) Ordinary least squares (OLS). A consistent estimator only when
    ρ = 0.

(2) Least squares with constants (LSC), the fixed-effects
    estimator. Consistent and asymptotically efficient.

(3) "True" generalized least squares (GLS) using known (true)
    values of ρ. A minimum variance consistent estimator.

(4) Modified generalized least squares (MGLS) with ρ calculated
    as in (29) from an LSC first stage estimator. Consistent
    and asymptotically efficient.

(5) Maximum likelihood constrained (MLC) with σ² > 0 and 0 ≤ ρ ≤ 1.
    Asymptotically unbiased and efficient.

(6) Maximum likelihood unconstrained (MLU). Asymptotically unbiased
    but inefficient relative to MLC.
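
To make the contrast between estimators (1) and (2) concrete, the following
Python sketch computes both for the two-regressor model. It is a minimal
illustration under our own assumptions: the regression has no intercept, the
function names are hypothetical, and the six-row data set is invented so that
the outcome is an exact linear function of the regressors plus a unit-specific
constant.

```python
def two_regressor_ols(rows):
    """No-intercept least squares of y on (z1, z2); rows are (z1, z2, y)."""
    s11 = s12 = s22 = s1y = s2y = 0.0
    for z1, z2, y in rows:
        s11 += z1 * z1
        s12 += z1 * z2
        s22 += z2 * z2
        s1y += z1 * y
        s2y += z2 * y
    det = s11 * s22 - s12 * s12          # determinant of the 2x2 cross-product matrix
    return ((s22 * s1y - s12 * s2y) / det,
            (s11 * s2y - s12 * s1y) / det)


def ols(panel):
    """Estimator (1): pool all rows and ignore the unit effects."""
    return two_regressor_ols([row for unit in panel for row in unit])


def lsc(panel):
    """Estimator (2): subtract unit means, then apply least squares."""
    demeaned = []
    for unit in panel:
        n = len(unit)
        means = [sum(row[k] for row in unit) / n for k in range(3)]
        demeaned.extend(tuple(row[k] - means[k] for k in range(3)) for row in unit)
    return two_regressor_ols(demeaned)


# Invented example: y = 0.5*z1 + 2.0*z2 + unit constant (3 for unit A, -1 for unit B).
toy_panel = [
    [(1.0, 0.0, 3.5), (0.0, 1.0, 5.0), (2.0, 1.0, 6.0)],
    [(1.0, 2.0, 3.5), (3.0, 1.0, 2.5), (0.0, 0.0, -1.0)],
]
print(ols(toy_panel))   # pooled estimates absorb part of the unit constants
print(lsc(toy_panel))   # within-unit estimates recover 0.5 and 2.0
```

In this invented example the pooled estimate of the first coefficient is 0.75
rather than 0.5, because the unit constants are correlated with the
regressors; removing unit means eliminates the constants exactly.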
An initial set of parameter estimates must be provided to find the ML
estimates in both methods (5) and (6). We compared the performance of
two types of starting values for five different parameter combinations
(a total of 500 data sets) using unconstrained ML: the LSDV estimates
and the true values used to generate the data. The two types of
initial estimates produced nearly identical final estimates for the
four combinations in which ρ > 0. For ρ = 0 the two sets of parameter
estimates differed in only a handful of cases, and by a negligible amount.
Therefore, because of the cost involved in obtaining the LSDV estimates,
we used the true parameter values as starting estimates in all remaining
ML estimations. We report only the results obtained from using this
latter type of initial estimates.

Whereas Nerlove (1971) used the Fletcher-Powell algorithm (1963)
programmed by Wells (1967) to maximize L, Tuma and Young (1976) used the
Gill-Murray algorithm (1972) programmed by Wright (1975). Both algorithms are
iterative procedures and are based on modified steepest descent methods
of function minimization. Gill, Murray and their coworkers (1972a, 1972b)
have shown that the Gill-Murray algorithm converges more rapidly and
reliably than the Fletcher-Powell algorithm. However, when both
converge, they report that the two algorithms give extremely similar
estimates of the function optimum for a variety of functions.

Tuma and Young's (1976) treatment of constraints on parameter values for
σ² and ρ departed markedly from Nerlove's (1971). Nerlove constrained σ²
to be positive by maximizing L with respect to σ rather than σ². He imposed
a nonnegativity constraint on ρ by equating it with sin² θ and maximizing
L with respect to θ rather than ρ. As Nerlove acknowledges, this method
of applying constraints causes L to have multiple maxima with respect
to θ since sin θ is a periodic function. Murray (1972) warns against
employment of trigonometric constraints. Such a procedure can increase
the nonlinearity of the function being maximized and cause the matrix of
second derivatives (which must be negative definite at the maximum of the
likelihood function) to become singular or ill-conditioned.
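
The multiplicity of maxima introduced by a trigonometric reparameterization
is easy to see numerically. The Python sketch below is purely illustrative:
the objective is a hypothetical concave stand-in for a likelihood in ρ (not
the likelihood (30)), and we assume the substitution ρ = sin² θ. It counts
local maxima on a grid, first in ρ over [0, 1] and then in θ over one full
period.

```python
import math


def objective(rho):
    # Hypothetical concave stand-in for a log likelihood, peaked at rho = 0.4.
    return -(rho - 0.4) ** 2


def count_local_maxima(values):
    # An interior grid point counts as a local maximum when it exceeds its
    # left neighbor and is at least as large as its right neighbor.
    return sum(1 for k in range(1, len(values) - 1)
               if values[k] > values[k - 1] and values[k] >= values[k + 1])


rho_grid = [k / 1000 for k in range(1001)]                           # rho in [0, 1]
theta_grid = [k / 1000 for k in range(int(2 * math.pi * 1000) + 1)]  # one period of theta

maxima_in_rho = count_local_maxima([objective(r) for r in rho_grid])
maxima_in_theta = count_local_maxima(
    [objective(math.sin(t) ** 2) for t in theta_grid])

print(maxima_in_rho, maxima_in_theta)
```

The single maximum in ρ becomes four maxima per period in θ, one for each
solution of sin² θ = 0.4, so an iterative routine may stop at any of them
depending on its starting value.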

The Gill-Murray algorithm used by Tuma and Young (1976) for ML estimation
utilizes a projection method of optimization that permits any feasible
equality or inequality constraints to be imposed on parameter values.
For a detailed discussion of this constrained optimization procedure,
see Gill and Murray (1972). This method does not increase the nonlinearity
of the function being optimized or the number of local maxima.

To our knowledge there is no previous evidence indicating the
magnitude of the effects of constraining σ² and ρ on ML parameter
estimates for the model we have simulated. Thus, we do not know
whether the mean squared errors (MSE's) of the constrained estimates
of ρ and σ² will be appreciably smaller than the unconstrained
versions. Further, we do not know the effects of constraining σ²
and ρ on the quality of the estimates of γ₁ and γ₂. Finally, it is
important to learn whether the poor performance of the ML method in
Nerlove (1971) results from the small-sample properties of ML estimation
of this model or from the implementation of parameter constraints.



Results. Before looking at mean squared error and bias of
estimators we comment on the effectiveness and practicality of the
maximum likelihood procedure used. This issue has heightened importance
in the present context as Nerlove (1971), in a very influential paper,
reports that MLE failed to converge on most occasions and thus did
not stand as serious practical alternatives to MGLS. Tuma and Young (1976)
find that implementation of the ML methods was both successful and practical.
Not only did ML estimation converge to a solution for every data set, but
also the time required for this was short. On the average the ML solution
was found in four to ten iterations, depending on the particular
combination of parameter values. The MLC and MLU methods required
nearly identical numbers of iterations to converge. For both methods
several more iterations were usually needed for high values of ρ,
especially when (γ₁ = 0.8, γ₂ = 0.5). These higher numbers of
iterations occur together with poor quality of the ML estimates of
σ² and ρ, as described more fully later in this section.

It is helpful to know which parameter combinations led to activation of
constraints. Obviously for the cases in which no constraints were activated,
the MLC and MLU estimates are identical. The constraints that σ² be
positive and that ρ be less than or equal to one were never brought
into play (cf. Nerlove 1971). However, the constraint that ρ be
nonnegative was activated in about sixty percent of the cases in which
γ₁ = 0.8, γ₂ = 1.0, and ρ = 0.90. Thus the quality of MLC
and MLU estimators is unlikely to differ except for these combinations
of parameters.



We begin our assessment of the quality of the estimators by comparing
overall mean squared errors averaged over the five choices of ρ.
These are reported in Table 1. Overall the OLS and LSC estimators
are inferior as we expected.

Table 1 about here

The simulation results agree with the large sample theory in that OLS has
largest MSE in each case. Moreover, these two estimators are notably poor in
estimating γ₁, and, as we have noted repeatedly, this failure has serious
consequences in analysis of continuous-time models. On the basis of these
results and the further evidence in Hannan and Young (1977) we advise against
use of OLS and LSC for random-effects models. Henceforth we direct attention
only to the MGLS and ML estimators.

The relative quality of the ML and MGLS estimates varies according
to the size of the ratio of γ₁, the coefficient of the lagged endogenous
variable, to γ₂, the coefficient of the exogenous variable. We find
that ML is superior when the effect of the lagged endogenous variable
is small in comparison to the effect of the exogenous variable, while
MGLS is best when the opposite is true. As we report below, the
dependence of the relative quality of the ML and MGLS estimates of
regression coefficients on the relative effects of γ₁ and γ₂ becomes
even more apparent when the simulation results are not aggregated
over values of ρ.

We now turn our attention to a more detailed examination of the
performance of the MLU and MLC estimates, contrasted to each other
and to the best of the least squares methods, MGLS. We use the
measure:

$$\%\,\mathrm{bias}(\hat{\theta}) = 100\% \times \mathrm{bias}(\hat{\theta}) / \theta$$

For both γ₁ and γ₂ the % biases of the MLC and MLU estimates are very
similar across all parameter combinations (see Tables 2 and 3
respectively).
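
For readers who wish to compute these summary measures from their own
replications, a minimal Python sketch follows. The function names and the
four replicate values are hypothetical. Bias is taken as the mean estimate
minus the true value, so the percent bias is undefined when the true value is
zero (as it is for ρ = 0).

```python
def percent_bias(estimates, true_value):
    """100% times (mean estimate - true value) / true value."""
    mean_estimate = sum(estimates) / len(estimates)
    return 100.0 * (mean_estimate - true_value) / true_value


def mean_squared_error(estimates, true_value):
    """Average squared deviation of the estimates from the true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


# Invented replicate estimates of a coefficient whose true value is 0.3.
replicates = [0.25, 0.35, 0.30, 0.26]
print(percent_bias(replicates, 0.3))
print(mean_squared_error(replicates, 0.3))
```

Because mean squared error equals variance plus squared bias, two estimators
with similar variances can be ranked by the size of their biases alone.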



Tables 2 and 3 about here 



Both the ML and MGLS methods display consistently low % biases
in γ₂ across all parameter combinations. However, both methods of
estimation produce widely varying % biases in γ₁. For each combination
of γ₁ and γ₂ the % bias in ML estimates of γ₁ tends to become worse as
ρ increases. However, for the first two combinations of γ₁ and γ₂
there is a downturn in the % bias for very high values of ρ. On the
other hand, the MGLS estimates of γ₁ are downwardly biased for low
values of ρ but the % bias increases monotonically as ρ increases,
approaching a negligible % bias for ρ = 0.9.

Of course, the MSE's reported earlier also depend on the
variances of estimators. However, there are only slight differences
between the two ML estimators and MGLS in variances. For both types of
estimators the variance falls off sharply as ρ increases. As this is the only
interesting pattern in the variances we do not report the actual figures
(see Tuma and Young 1976, Tables 4 and 5).

Finally we look at estimates of ρ. In both MLU and MLC estimates
of ρ the biases are usually negative and very similar (see Table 4).

Table 4 about here

The magnitude of the bias in ρ is somewhat smaller for MLC than for
MLU when (γ₁ = 0.8, γ₂ = 0.5, 0.25 ≤ ρ ≤ 0.75) but slightly larger
for MLC than for MLU when ρ = 0.0. The size of the bias in ML
estimates of ρ tends to increase as ρ increases; however, for two
of the three combinations of the regression coefficients, there is
a downturn in the bias in ρ as the value of ρ becomes very large.

The ML and MGLS methods perform optimally at opposite ends of the
ρ continuum. Whereas ML estimates of ρ are almost always downwardly
biased, the MGLS estimates of ρ are almost always upwardly biased. And,
as we found in our examination of the % biases of γ₁, the performance
of the ML method tends to be best when MGLS is at its worst, and vice
versa. Thus, we find that while the bias in ML estimates of ρ is
greatest for high values of ρ and least for low values of ρ, just the
opposite is true for MGLS. The MGLS estimates of ρ are most biased
when ρ is near zero and least biased when ρ is near unity.

Nonetheless, the ML and MGLS methods have two obvious similarities:
(1) there is an inverse relationship between % bias in γ₁ and bias in ρ,
and (2) absolute values of the biases in γ₁ and ρ are positively
associated. These similarities are curious because the ML and MGLS
methods have opposite signs to the biases of their estimates of ρ and
of γ₁. Though the two methods differ dramatically in their tendencies
to attribute stability in the dependent variable to serial correlation
of residuals for individual units rather than to inertia in the
dependent variable, for both methods there are compensating effects.
That is, for both methods error in one direction in estimating
the strength of serial correlation of residuals is accompanied by
error in the opposite direction in estimating the strength of the
lagged endogenous variable.

We conclude that both ML and MGLS perform relatively well with panel
data of the size usually available to sociologists (N = 50, T = 5).
They clearly outperform OLS and LSDV. It appears that MGLS does best

when is small. This implies ^that MGLS has best small sample properties 

when systems under study adjust rapidly relative to the time scale chosen
(or, under the alternative interpretation, have strong negative
feedback). On the other hand, MLE appear preferable for systems
that adjust more slowly. In light of previous work on these issues,
perhaps the most important conclusion is that both ML and MGLS are
practical and appear to have good small sample properties.

We also provide at least a partial answer to the question: Should natural
constraints on parameters be imposed? Tuma and Young (1976) find, as
did Nerlove (1971), that in practice only the nonnegativity constraint
on ρ is at issue because other natural constraints are never violated.
These results show that in terms of the mean squared error of γ₁
and γ₂, ML estimation with constraints on ρ has a slight advantage
over that without constraints. Clearly constrained ML estimation
gives more reasonable estimates of ρ because it prevents ρ from
having a negative value, which is contrary to the assumptions of
the model. In addition, the constrained ML estimates of the regression
coefficients always have a smaller variance than the unconstrained ones,
and this usually compensates for occasionally larger biases in the
constrained estimates. Still, the differences between the constrained
and unconstrained ML estimates are never large and always negligible
for those parameter combinations in which ML estimates are superior in
quality to MGLS estimates. Consequently, this research provides no
evidence that omitting constraints on ρ will seriously damage the
quality of ML estimates of regression coefficients in the model.

10. Unequally Spaced Observations

To this point we have assumed equally spaced panel observations. As
long as waves in the panel are repeated with constant period for all
units, several approaches to estimation have merit. We saw that two
broad strategies have been proposed. Within each strategy, several
estimators have good properties. But once we venture beyond this
standard design to consider unequal spacing, we face greatly limited
alternatives. In fact only one strategy and one estimator appear
feasible: maximum likelihood applied to integral equations.

Two classes of designs may yield unequally spaced data. The first
is the conventional multi-wave panel where the length of lags between
waves varies but is the same for all units. In field research such
variability in the timing of waves may arise from the vagaries of
flows of research funds, problems of entry into sites, renewed
interest in some earlier panel, etc. For example, Meyer's (1975)
three-wave panel of finance agencies has a three-year lag between the
first two waves and a six-year lag between the second and the third. The
widely analyzed Sewell (see, for example, Hauser and Sewell 1975) panel
of Wisconsin high school seniors was interviewed in 1957, 1964, and 1976.
Exactly the same sorts of problems arise in archival research since
official sources often release data at intermittent intervals. Moreover,
researchers using secondary sources must often depend on the timing of
several scholars or groups of scholars. They are thus often confronted
with unequally spaced data.

The second, perhaps more important, problem concerns timing that
varies from unit to unit. This problem may also arise for the reasons
discussed above. Some individuals may be "lost" to a panel and only
recovered at some later time. However, there is a more systematic
reason why the timing of observations may vary among units. Panel
observations may be linked to events that are generated by a stochastic
process. Sometimes this is done within a retrospective design. For example,
the Parnes (1975) "mature woman panel" contains work histories at marriage,
at first birth, etc. Since different women have different timing of events,
the panel will have extreme unequal spacing.

We argue that sociologists ought to study coupled changes in
qualitative and quantitative outcomes (e.g., marital status and earnings).
One fruitful approach to such systems involves studying changes in
quantitative variables over periods that begin and end with events
(changes in state or qualitative variables). We surmise that, if we
are to make progress on the important class of problems that involve
coupled changes in quantity and quality, we must solve the problem of
analyzing unequally spaced panel data.

The first type of unequal spacing is usually dealt with by
analyzing pairs of waves separately. But this is an unsatisfactory
solution in many instances since it obviates the possibility of adjusting
for unit-specific disturbances. In most sociological applications, such
a failure makes a real difference in estimates of the parameters of the
underlying continuous-time model from different lags. Should we be
tempted as a consequence to treat the data as generated by several
discrete-time processes with different lags, there is another problem. As
we showed above, there is no metric available to compare results
from different lags when the process is viewed as discrete in time.
Thus the analyst cannot draw sound inferences about stability or change
in the process. He has not one but two or more processes. Thus there
is a tremendous loss of generality. So the analyst with panel data
with the simpler form of unequal spacing faces two unhappy alternatives:
report different estimates of one process (with the fear that the
differences reflect only autocorrelation bias) or report estimates
of several discrete-time processes for the same substantive problem (where
the lag structures are determined by the peculiarities of the research
design).

We have not seen any analysis of data of the more extreme type of
unequal spacing that pays attention to these methodological problems.
Moreover, we have not yet found any systematic treatment of the general
problem. So despite its obvious practical importance and its possible
substantive importance, the issue of how to estimate
models from unequally spaced data has received surprisingly little
attention. We attribute this lacuna to the common preoccupation with
discrete-time models in the social sciences. We now show that shifting
to a continuous-time perspective suggests solutions to unequal spacing
problems.
10.1 MLE for Unequally Spaced Observations

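The simplest case concerns the linear stochastic differential
equation with no unit-specific components:

$$dY(t) = a\,dt + bY(t)\,dt + cX(t)\,dt + \sigma\,dB_t$$

where B_t is a normal Brownian motion. This equation has solution (subject to
initial conditions Y(t₀) = Y₀, X(t₀) = X₀ and assuming X(t) changes linearly
over Δt)

$$Y(t) = \frac{a}{b}\left(e^{b\Delta t} - 1\right) + e^{b\Delta t}\,Y_0 + \frac{c}{b}\left(e^{b\Delta t} - 1\right)X_0 + \frac{c}{b}\left[\frac{e^{b\Delta t} - 1}{b\,\Delta t} - 1\right]\Delta X + \sigma\int_{t_0}^{t} e^{b(t-s)}\,dB_s \qquad (33)$$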

Because Δt varies, each unit has its own set of parameters in the integrated
form. Let us rewrite (33) as

$$Y_i(t_i) = a_i^* + b_i^*\,Y_{i0} + c_i^*\,X_{i0} + d_i^*\,\Delta X(t_i) + \varepsilon_i \qquad (34)$$

We know that εᵢ is N(0, σᵢ²) where

$$\sigma_i^2 = \frac{\sigma^2}{2b}\left(e^{2b\Delta t_i} - 1\right)$$

Consequently we may write the likelihood function

$$\mathcal{L}(a, b, c, \sigma;\ \mathrm{data}) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma_i^2}}\,\exp\left[-\frac{\varepsilon_i^2}{2\sigma_i^2}\right] \qquad (35)$$

or

$$\log\mathcal{L} = \sum_{i=1}^{N}\log\frac{1}{\sqrt{2\pi\sigma_i^2}} - \frac{1}{2}\sum_{i=1}^{N}\frac{\varepsilon_i^2}{\sigma_i^2} \qquad (36)$$

where σᵢ² is defined above. But under the model,

$$\varepsilon_i^2 = \left[Y_i(t_i) - a_i^* - b_i^*\,Y_{i0} - c_i^*\,X_{i0} - d_i^*\,\Delta X(t_i)\right]^2 \qquad (37)$$

and, see (32), aᵢ*, bᵢ*, cᵢ*, and dᵢ* are explicit functions of the dynamic
parameters of interest.

Since the Δtᵢ are observed data, this likelihood may be maximized with
respect to a, b, c, and σ. This requires writing out first and second
derivatives of (36) with respect to these parameters and using these
expressions in one of the standard iterative routines. We have adapted the
Gill-Murray algorithm, used to estimate rates, for this purpose. We will
illustrate the procedure below.
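
As a concrete illustration of how the log likelihood (36) can be evaluated
when lags differ across units, consider the following Python sketch. It is
not the adapted Gill-Murray program. The function names and the example
records are hypothetical, the coefficient expressions assume the exact
discretization of the linear model with X changing linearly over each
interval, and b must be nonzero.

```python
import math


def integrated_coefficients(a, b, c, dt):
    """Coefficients of the integrated form for a lag of length dt (b != 0)."""
    growth = math.exp(b * dt)
    a_star = a * (growth - 1.0) / b
    b_star = growth
    c_star = c * (growth - 1.0) / b
    d_star = (c / b) * ((growth - 1.0) / (b * dt) - 1.0)
    return a_star, b_star, c_star, d_star


def disturbance_variance(b, sigma, dt):
    """Variance of the integrated Brownian disturbance over a lag of length dt."""
    return sigma ** 2 * (math.exp(2.0 * b * dt) - 1.0) / (2.0 * b)


def log_likelihood(a, b, c, sigma, observations):
    """Sum of normal log densities; each record is (y0, x0, y1, x1, dt)."""
    total = 0.0
    for y0, x0, y1, x1, dt in observations:
        a_s, b_s, c_s, d_s = integrated_coefficients(a, b, c, dt)
        residual = y1 - a_s - b_s * y0 - c_s * x0 - d_s * (x1 - x0)
        variance = disturbance_variance(b, sigma, dt)
        total += -0.5 * math.log(2.0 * math.pi * variance)
        total += -0.5 * residual ** 2 / variance
    return total


# Invented records for three units observed over lags of 3, 6, and 4.5 years.
records = [(2.0, 1.0, 2.9, 1.4, 3.0),
           (1.5, 0.8, 3.1, 2.0, 6.0),
           (2.4, 1.1, 3.0, 1.3, 4.5)]
print(log_likelihood(a=1.0, b=-0.2, c=0.5, sigma=1.5, observations=records))
```

Any general-purpose optimizer can then be applied to this function of
(a, b, c, σ); numerical derivatives are a simple, if slower, substitute for
the analytic first and second derivatives mentioned above.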

Suppose one has reason to believe that each unit changes in response
to unobserved constant factors as discussed above. Then the model is

$$dY_i(t) = a\,dt + bY_i(t)\,dt + cX_i(t)\,dt + \mu_i\,dt + \sigma\,dB_t \qquad (38)$$

and, subject to the same conditions stated above, has solution

$$Y_i(t_i) = \frac{a}{b}\left(e^{b\Delta t_i} - 1\right) + e^{b\Delta t_i}\,Y_{i0} + \frac{c}{b}\left(e^{b\Delta t_i} - 1\right)X_{i0} + \frac{c}{b}\left[\frac{e^{b\Delta t_i} - 1}{b\,\Delta t_i} - 1\right]\Delta X_i(t_i) + \frac{\mu_i}{b}\left(e^{b\Delta t_i} - 1\right) + \sigma\int_{t_0}^{t_i} e^{b(t_i - s)}\,dB_s \qquad (39)$$

or

$$Y_i(t_i) = a_i^* + b_i^*\,Y_{i0} + c_i^*\,X_{i0} + d_i^*\,\Delta X_i(t_i) + u_i(t_i) \qquad (40)$$

where

$$u_i(t_i) = \frac{\mu_i}{b}\left(e^{b\Delta t_i} - 1\right) + \sigma\int_{t_0}^{t_i} e^{b(t_i - s)}\,dB_s \qquad (41)$$

One cannot identify σ² from only two waves of observations. However,
when three or more waves are available on each unit, all the dynamic
parameters may be identified.

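Let us consider the general case where the number of observations
varies from unit to unit. We denote the number of observations on unit i
by Tᵢ and let $\Delta t_{ij} = t_{ij} - t_{i,j-1}$, where $t_{ij}$ is the time of the jth
observation on the ith unit. Then we write a pooled model as follows:

$$
\begin{bmatrix} Y_{11} \\ Y_{12} \\ \vdots \\ Y_{N T_N} \end{bmatrix}
= \frac{a}{b}
\begin{bmatrix} e^{b\Delta t_{11}} - 1 \\ e^{b\Delta t_{12}} - 1 \\ \vdots \\ e^{b\Delta t_{N T_N}} - 1 \end{bmatrix}
+ \begin{bmatrix} e^{b\Delta t_{11}}\,Y_{10} \\ e^{b\Delta t_{12}}\,Y_{11} \\ \vdots \\ e^{b\Delta t_{N T_N}}\,Y_{N,T_N - 1} \end{bmatrix}
+ \frac{c}{b}
\begin{bmatrix} \left(e^{b\Delta t_{11}} - 1\right)X_{10} \\ \left(e^{b\Delta t_{12}} - 1\right)X_{11} \\ \vdots \\ \left(e^{b\Delta t_{N T_N}} - 1\right)X_{N,T_N - 1} \end{bmatrix}
+ \frac{c}{b}
\begin{bmatrix} \left[\dfrac{e^{b\Delta t_{11}} - 1}{b\,\Delta t_{11}} - 1\right]\Delta X_{11} \\ \left[\dfrac{e^{b\Delta t_{12}} - 1}{b\,\Delta t_{12}} - 1\right]\Delta X_{12} \\ \vdots \\ \left[\dfrac{e^{b\Delta t_{N T_N}} - 1}{b\,\Delta t_{N T_N}} - 1\right]\Delta X_{N T_N} \end{bmatrix}
+ \begin{bmatrix} u_{11} \\ u_{12} \\ \vdots \\ u_{N T_N} \end{bmatrix}
\qquad (42)
$$

where

$$
\begin{bmatrix} u_{11} \\ u_{12} \\ \vdots \\ u_{N T_N} \end{bmatrix}
= \begin{bmatrix} \dfrac{\mu_1}{b}\left(e^{b\Delta t_{11}} - 1\right) \\ \dfrac{\mu_1}{b}\left(e^{b\Delta t_{12}} - 1\right) \\ \vdots \\ \dfrac{\mu_N}{b}\left(e^{b\Delta t_{N T_N}} - 1\right) \end{bmatrix}
+ \begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \vdots \\ \varepsilon_{N T_N} \end{bmatrix}
$$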

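And the "disturbance" vector of the integrated form is N(0, Σ) with

$$\Sigma = \begin{bmatrix} \Sigma_1 & & \\ & \ddots & \\ & & \Sigma_N \end{bmatrix}$$

where Σᵢ is a Tᵢ by Tᵢ matrix with diagonal elements

$$\frac{\sigma_\mu^2}{b^2}\left(e^{b\Delta t_{ij}} - 1\right)^2 + \frac{\sigma^2}{2b}\left(e^{2b\Delta t_{ij}} - 1\right)$$

and off-diagonal elements

$$\frac{\sigma_\mu^2}{b^2}\left(e^{b\Delta t_{ij}} - 1\right)\left(e^{b\Delta t_{ik}} - 1\right), \qquad j \neq k,$$

where σ_μ² denotes the variance of the unit-specific effects μᵢ.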

Now u may be expressed as a function of observable variables Y, X, and ΔX
as above. Since u is normal, we may write an explicit likelihood function
and estimate parameters by standard iterative algorithm. Rather extensive
programming is required, however, to make this scheme operational. Our
research group is currently conducting this work in preparation for application
of these methods to sociological data.



11. Conclusions

The thrust of this report has been to show that we can use available
methods to solve many of the practical problems that arise in applying
continuous-time, continuous-state models in sociological research. In part we
have shown that sociologists have begun to estimate implicitly (by use of normal
theory assumptions with deterministic models) linear change models driven by
Brownian motion. Only a slight change in perspective is required for the
usual estimates to be transformed into estimates of a simple probabilistic
model for change in quantitative variables.

We have devoted considerable attention to the likely problem of
autocorrelation. We suggest that a combination of Brownian motion
disturbances and unit-specific permanent effects may apply meaningfully
to a variety of sociological analyses. If so, modest extensions of available
estimators for pooled cross-section and time series data may be used
profitably. We showed that both generalized least squares and maximum
likelihood estimators have good properties for sample sizes typically used
by sociologists.

Finally, we illustrated one of the major advantages of continuous-time
modeling of social processes: the ability to handle data collected with
unequal spacing in time. The maximum likelihood estimators we discuss may
be extended to this case in a straightforward, though tedious, way.
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FOOTNOTES 




Such studies are most common in the engineering literature, where the term
"filtering" is used instead of "estimation"; see Jazwinski (1970) for a
review.

The tables that follow contain excerpts from various tables in Hannan and
Young (1977) and Tuma and Young (1976). Both reports contain considerable
additional detail.

In many cases, both quantitative and qualitative outcomes will be measured
in the same interviews. Singer and Spilerman (1976) have demonstrated
that panel studies of discrete (i.e., qualitative) stochastic processes
should not use a constant lag between waves, but should be irregularly
spaced. If this advice is followed, the quantitative records from such
interviews will have the structure we discuss here.








Amemiya, T, ' • "' ■■; ■ 

' . 196^ "A notion the /est imat ion of Baiestra-Nerlove models.' 1 Technical 
Report N'ci, . 4, , Institute for Mathematical Studies in the Social 
Sciences Stanford ■ University. 
Balestra, P, and M. Nerlove -.' - 

1966 "Pooling cross-section and time series data in the estimation 

'n't'' 

- - : of a dynam±t?-. L model: the demand .for natural gas," Econometrlca 
* 34: 585-612, 7 
Bergstrom, A, R. (ed. ) , 

1976 Statistical Inference in* Continuous-Time Economic Models , 
Amsterdam: North-Holland. 
Coleman, J. S . 

1968 "The mathematical study of change,:' Pp. 428-78 in H. M- 

BJjalock, ' Jr. and A, Blalock (edsj, Methodolo gy "in Social 

% ■ • - tti 

Research. New York: McGraw-Hill, 

1: - — ■ • 

Fletcher,' P. ahcLM. j. D, Powell 5 , ■/ 

196 3 "A*;rapidly convergent descent method for minimization . " 

-.v. , 
Computer JoQrnal 6: 163-B8. . f 

Gill, P. F. and W. Murray s 0 [■ ; 

" 1972a "The implementation of 'two 'modified Newton algorithms for 

unconstrained optimization." National Physical Laboratory 

Report NAC 24. , . . 

197 2b "Two methods for the solution of linearly constrained^ .and un- 

constrained optimization problems.* National Physical Laboratory 
* Report NAC 23. 6 ' ■ . 



40 



I [aim. in, M. T. 

1978a "Models ^or change In quantitative variables, ' {Jarc I: 

deterministic models," Technical Report No* 63, Laboratory 
Jfor Social Research, Stanford University* 
1978b "Models for change in quantitative variables, part II* 

stochastic models," Technical Report No, 64, Laboratory 
v» for Social Research, Stanford University. - 

Hannan, M. T. and J . Freeman ■ 

1978 "Internal politics of growth and decline,-!; Pp. 1 77-99 in 
M. Meyer et al* (eds, ) , Environment and Organization , San 
Francisco: Jossey^Bass* 
Hannan, M* T. and A, A. Young 

1974 "Estimation of pooled cross-section and time series models* 

preliminary Monte Carlo results*" Pdper presented at Conference 
oft Policy Research in Education' Methods and Implications, 
University of Wisconsin, Madison, 
1977 "jpstimation in panel* models : results on pooling cross-sections 
' and t ime series*" Hp: 52-83 in D. Heise (ed*h ^penological 
-. ■ Ijfe'fchod o logy 1 9 77 , San Francisco^ Jossey-Bass, 
Ha user 5 R, M, and W, Sewell 

4975 . Educajtion^ Occupaticmj and Earnings . New York: Academic Press* 
Henderson, C, R* 

1952 "Specific and combining ability." Pp„ : .352-7Q in J. W* Gowens 
(ed * ) / He t eras is ^ Ames , Iowa : Iowa* State College Press, 

Hurd," M. D. 

,197 2 "Small sample estimation of a structural equation wtth auto-* 
correlated errors," Journal of the America n Statistical 
1 Association 67: 56 7 — 7 3 • 

" " ■ 42 " • " 



Johnston* ' J • ' ' 

1972 Econometric Methods , 2nd edition* New York* McGraw-Hiir* , 
Knuth, D. E: ,, t , : 1 

1969 the Art of Computer Programming, Vol* 2,' Cambridge, Mass, i 
Addison-Wiley * 

Kuh, F, . , 

1959 "The validity of cross-sectlonally estimated behavior equations 

i \ , 

in time series applications," Econometrica 27 : 197-214^ . 

Meyer, M. p 

1 1975 "Leadership and organizational structure*" American Journal 

of Sociology 81: 514-41. 

Murrary, W. (ed . ) 

1972 Numerical Methods for Unconstrained Optimization * New York: 
Academic Press. 

Nerlove, M.
1971 "Further evidence on the estimation of dynamic economic
relations from a time series of cross-sections." Econometrica
39: 359-82.

Parnes, H. S.
1975 "The national longitudinal surveys: new vistas for labor
market research," American Economic Review 65: 244-49.

Searle, S. R.
1971 "Topics in variance components estimation," Biometrics 27:
1-76.

Singer, B. and S. Spilerman
1976 "The representation of social processes by Markov models,"
American Journal of Sociology 82: 1-54.






Tuma, N. B. and A. A. Young
1976 "Constrained and unconstrained maximum likelihood estimation
of a variance components model of cross-sections pooled over
time," Technical Report No. 60, Laboratory for Social
Research, Stanford University.

Wells, M.
1967 "Function minimization," Algorithm 251 in Collected Algorithms
from CACM, New York: Association for Computing Machinery.

Wright, M.
1975 "LCMNA: a set of FORTRAN subroutines for minimization of a
set of non-linear functions subject to linear inequality or
equality constraints," Stanford Computation Center
Documentation, Stanford University.






Table 1

Mean Squared Error of Estimates

(Cases averaged over all values of ρ; each
entry based on 500 sets of estimates)

b = 0.3, c = 1.0
    OLS            6.449    .348
    LSC             .822    .239
    MGLS            .226    .194
    MLC             .169    .182
    MLU             .175    .182

b = 0.8, c = 1.0
    OLS            1.592    .341
    LSC             .748    .228
    MGLS            .146    .198
    MLC             .720    .199
    MLU             .722    .199

b = 0.8, c = 0.5
    OLS            2.420    .228
    LSC            3.865    .220
    MGLS            .925    .194
    MLC            2.352    .208
    MLU            2.415    .218

All entries in this table have been multiplied by 10



Table 2

Percent Bias in γ̂1

(Each entry based on 100 sets of estimates)

           ρ =       0.0    0.25     0.5    0.75     0.9

γ1 = 0.3, γ2 = 1.0
    MLU              0.0%*   3.7     3.9     2.3     0.9
    MLC             -1.2     3.7     3.9     2.3     0.9
    MGLS           -22.4   -12.2    -5.9    -1.5    -0.1

γ1 = 0.8, γ2 = 1.0
    MLU              0.2%    9.3    13.6    14.2     6.4
    MLC             -0.8     9.2    13.6    14.2     6.4
    MGLS            -7.3    -2.4     0.3     1.9     1.7

γ1 = 0.8, γ2 = 0.5
    MLU             -0.?    16.7    21.4    23.4    23.9
    MLC             -1.5    15.8    21.0    23.3    23.9
    MGLS           -19.0   -12.4    -7.7    -1.8     1.9

* All entries in this table have been rounded off to the nearest tenth
of a percent.


Table 3

Percent Bias in γ̂2

(Each entry based on 100 sets of estimates)

           ρ =       0.0    0.25     0.5    0.75     0.9

γ1 = 0.3, γ2 = 1.0
    MLU             -0.0%*  -0.0     0.1     0.1    -0.0
    MLC              0.1    -0.0     0.1     0.0    -0.0
    MGLS            -0.3    -0.?    -0.3    -0.2    -0.1

γ1 = 0.8, γ2 = 1.0
    MLU              0.0    -1.6    -1.6    -0.4     0.4
    MLC              0.1    -1.6    -1.6    -0.4     0.4
    MGLS            -0.1    -0.3    10.2     0.0     0.0

γ1 = 0.8, γ2 = 0.5
    MLU              0.0    -2.5    -3.1    -2.8    -1.5
    MLC              0.2    -2.0    -2.6    -2.6    -1.5
    MGLS            -0.9    -1.4    -1.1    -0.5     0.0

* All entries in this table have been rounded off to the nearest tenth
of a percent.


Table 4



Bias of ρ

(Each entry based on 100 sets of estimates)

           ρ =       0.0    0.25     0.5    0.75     0.9

γ1 = 0.3, γ2 = 1.0
    MLU             .006    .040    .048    .025    .008
    MLC             .017    .040    .048    .025    .008
    MGLS            .254    .215    .145    .064    .021

γ1 = 0.8, γ2 = 1.0
    MLU             .007    .169    .345    .398    .101
    MLC             .016    .166    .344    .398    .101
    MGLS            .325    .320    .219    .092    .027

γ1 = 0.8, γ2 = 0.5
    MLU             .005    .273    .519    .728    .766
    MLC             .017    .242    .491    .718    .766
    MGLS            .445    .477    .340    .156    .047



